Software Prefetching for Indirect Memory Accesses
Similar resources
Improving Memory Performance for Indirect Accesses on SIMD Computers
SIMD machines operate more efficiently on a wider range of problems when they have the ability to access memory with both global and local addresses. Recent work has made possible the use of caches for global addresses. This paper examines techniques for employing caches to improve memory accesses with local addresses. Specifically, we examine the improvement from utilizing a cluster-based indir...
Page Rank Prefetching for Optimizing Accesses to Web Page Clusters
This paper presents a Page Rank based prefetching technique for accesses to web page clusters. The approach uses the link structure of a requested page to determine the “most important” linked pages and to identify the page(s) to be prefetched. The underlying premise of our approach is that in the case of cluster accesses, the next pages requested by users of the web server are typically based ...
A Prefetching Technique for Irregular Accesses to Linked Data Structures
Prefetching offers the potential to improve the performance of linked data structure (LDS) traversals. However, previously proposed prefetching methods only work well when there is enough work processing a node that the prefetch latency can be hidden, or when the LDS is long enough and the traversal path is known a priori. This paper presents a prefetching technique called prefetch arrays which...
Software Data Prefetching for Software Pipelined Loops
This paper focuses on the interaction between software prefetching (both binding and nonbinding prefetch) and software pipelining for statically-scheduled machines. First, it is shown that evaluating software pipelined schedules without considering memory effects can be rather inaccurate due to stalls caused by dependences with memory instructions (even if a lockup-free cache is considered). It...
Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors
The large latency of memory accesses is a major obstacle in obtaining high processor utilization in large scale shared-memory multiprocessors. Although the provision of coherent caches in many recent machines has alleviated the problem somewhat, cache misses still occur frequently enough that they significantly lower performance. In this paper we evaluate the effectiveness of non-binding softwa...
Journal
Journal title: ACM Transactions on Computer Systems
Year: 2019
ISSN: 0734-2071,1557-7333
DOI: 10.1145/3319393